A Mouth Full of Words: Visually Consistent Acoustic Redubbing
This paper introduces a method for automatic redubbing of video that exploits the many-to-many mapping of phoneme sequences to lip movements, modelled as dynamic visemes [1]. For a given utterance, the corresponding dynamic viseme sequence is sampled to construct a graph of possible phoneme sequences that synchronize with the video. When composed with a pronunciation dictionary and language model, this produces a vast number of word sequences that are in sync with the original video, literally putting plausible words into the mouth of the speaker. We demonstrate that traditional, one-to-many, static visemes lack the flexibility for this application, as they produce significantly fewer word sequences. This work explores the natural ambiguity in visual speech and offers insights for automatic speech recognition, highlighting the importance of language modelling.
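The composition described above, in which each viseme expands to several candidate phonemes and a lexicon then filters the resulting sequences, can be sketched as follows. The viseme classes, phoneme entries and the tiny lexicon are illustrative toys, not the paper's actual dynamic-viseme model or dictionary:

```python
from itertools import product

# Toy many-to-many mapping from viseme classes to candidate phonemes.
# The class names and entries are invented for illustration only.
viseme_to_phonemes = {
    "V1": ["p", "b", "m"],
    "V2": ["a", "o"],
    "V3": ["t", "d", "n"],
}

# Toy pronunciation "dictionary": phoneme sequences that form valid words.
lexicon = {("b", "a", "t"), ("m", "a", "n"), ("p", "o", "d")}

def word_candidates(viseme_seq):
    """Expand a viseme sequence into every phoneme sequence it could
    correspond to, then keep only those the lexicon recognises."""
    choices = [viseme_to_phonemes[v] for v in viseme_seq]
    return {seq for seq in product(*choices) if seq in lexicon}

# Every surviving word is lip-synchronous with the same viseme sequence.
print(sorted(word_candidates(["V1", "V2", "V3"])))
```

A real system would represent this expansion as a graph composed with a language model rather than enumerating the full product, but the filtering idea is the same.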
The Effect of Speaking Rate on Audio and Visual Speech
The speed at which an utterance is spoken affects both the duration of the speech and the position of the articulators. Consequently, the sounds that are produced are modified, as are the position and appearance of the lips, teeth, tongue and other visible articulators. We describe an experiment designed to measure the effect of variable speaking rate on audio and visual speech by comparing sequences of phonemes and dynamic visemes appearing in the same sentences spoken at different speeds. We find that both audio and visual speech production are affected by varying the rate of speech; however, the effect is significantly more prominent in visual speech.
Hand Keypoint Detection in Single Images using Multiview Bootstrapping
We present an approach that uses a multi-camera system to train fine-grained detectors for keypoints that are prone to occlusion, such as the joints of a hand. We call this procedure multiview bootstrapping: first, an initial keypoint detector is used to produce noisy labels in multiple views of the hand. The noisy detections are then triangulated in 3D using multiview geometry or marked as outliers. Finally, the reprojected triangulations are used as new labeled training data to improve the detector. We repeat this process, generating more labeled data in each iteration. We derive a result analytically relating the minimum number of views to achieve target true and false positive rates for a given detector. The method is used to train a hand keypoint detector for single images. The resulting keypoint detector runs in realtime on RGB images and has accuracy comparable to methods that use depth sensors. The single view detector, triangulated over multiple views, enables 3D markerless hand motion capture with complex object interactions. Comment: CVPR 201
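One round of the bootstrapping loop above can be sketched with the camera geometry abstracted away: here each "view" directly reports a noisy scalar estimate, and 3D triangulation is replaced by a robust consensus (the median). The tolerance and detection values are illustrative, not the paper's setup:

```python
import statistics

def bootstrap_round(view_detections, inlier_tol=1.0):
    """Fuse per-view detections, flag outliers, and return the consensus
    estimate plus the inlier views whose labels will be used to retrain."""
    # Stand-in for multiview triangulation: a robust consensus estimate.
    consensus = statistics.median(view_detections)
    # Views that disagree with the consensus are marked as outliers and
    # dropped; the fused estimate becomes the new training label.
    inliers = [d for d in view_detections if abs(d - consensus) <= inlier_tol]
    return consensus, inliers

# Three views agree; the fourth (25.0) is a gross detection error.
fused, inliers = bootstrap_round([10.1, 9.8, 10.3, 25.0])
print(fused, inliers)
```

Repeating this round with a detector retrained on the fused labels is what lets each iteration generate more, and cleaner, training data.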
Predicting Head Pose from Speech with a Conditional Variational Autoencoder
Natural movement plays a significant role in realistic speech animation. Numerous studies have demonstrated the contribution visual cues make to the degree to which we, as human observers, find an animation acceptable. Rigid head motion is one visual mode that universally co-occurs with speech, so it is a reasonable strategy to seek a transformation from the speech mode to predict the head pose. Several previous authors have shown that prediction is possible, but experiments are typically confined to rigidly produced dialogue. Natural, expressive, emotive and prosodic speech exhibits motion patterns that are far more difficult to predict, with considerable variation in expected head pose. Recently, Long Short-Term Memory (LSTM) networks have become an important tool for modelling speech and natural language tasks. We employ Deep Bi-Directional LSTMs (BLSTMs), capable of learning long-term structure in language, to model the relationship between speech and rigid head motion. We then extend our model by conditioning on prior motion. Finally, we introduce a generative head motion model, conditioned on audio features using a Conditional Variational Autoencoder (CVAE). Each approach mitigates the problems of the one-to-many mapping that a speech-to-head-pose model must accommodate.
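The generative step of a CVAE handles the one-to-many mapping by sampling: at test time a latent code is drawn from the prior and decoded together with the audio condition, so the same audio can yield different plausible poses. The tiny linear "decoder" and its weights below are placeholders, not a trained model:

```python
import random

def generate_pose(audio_feats, decode, latent_dim=2, seed=None):
    """Sample a latent code from the prior and decode it jointly with
    the audio condition into a head-pose value."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]  # prior sample
    return decode(z + audio_feats)  # decoder sees latent code + condition

# Placeholder decoder: a weighted sum producing one pose value.
weights = [0.5, -0.3, 1.0, 0.2]
decode = lambda x: sum(w * v for w, v in zip(weights, x))

# Different latent samples give different, equally valid poses for the
# same audio, which is how the one-to-many mapping is accommodated.
p0 = generate_pose([0.8, 0.1], decode, seed=0)
p1 = generate_pose([0.8, 0.1], decode, seed=1)
print(p0, p1)
```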
Audio-to-Visual Speech Conversion using Deep Neural Networks
We study the problem of mapping from acoustic to visual speech with the goal of generating accurate, perceptually natural speech animation automatically from an audio speech signal. We present a sliding-window deep neural network that learns a mapping from a window of acoustic features to a window of visual features from a large audio-visual speech dataset. Overlapping visual predictions are averaged to generate continuous, smoothly varying speech animation. We outperform a baseline HMM inversion approach in both objective and subjective evaluations and perform a thorough analysis of our results.
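The overlap-averaging step described above can be sketched as follows: each window's predictions are accumulated into a per-frame buffer and normalised by how many windows covered that frame. The window length, hop size and feature values are illustrative (real visual features would be vectors per frame, not scalars):

```python
def average_overlapping(windows, window_len, hop):
    """Average per-window predictions into one smooth per-frame track."""
    n_frames = (len(windows) - 1) * hop + window_len
    sums = [0.0] * n_frames
    counts = [0] * n_frames
    for w_idx, window in enumerate(windows):
        start = w_idx * hop  # frame where this window begins
        for i, value in enumerate(window):
            sums[start + i] += value
            counts[start + i] += 1
    # Each frame is the mean of every window prediction that covered it.
    return [s / c for s, c in zip(sums, counts)]

# Two length-3 windows with hop 1 overlap on two interior frames.
result = average_overlapping([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]], 3, 1)
print(result)
```

Averaging the overlapping regions is what removes frame-to-frame discontinuities at window boundaries.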
ADEnosine testing to determine the need for Pacing Therapy with the additional use of an Implantable Loop Recorder (ADEPT-ILR)
MD Thesis.
Aim: To determine the efficacy of permanent pacing in preventing syncopal episodes in patients with unexplained syncope and a positive adenosine test, via a randomised double-blind placebo-controlled crossover trial with an accompanying negative adenosine test implantable loop recorder arm.
Methods: Individuals presenting to secondary care with unexplained syncope underwent adenosine testing as defined by the European Society of Cardiology. Those with a positive test had a permanent pacemaker implant and were randomised to pacemaker on or off for 6 months before crossing over to the alternative mode. Those with a negative adenosine test underwent a loop recorder implantation. The primary outcome was cumulative syncope burden as reported by monthly syncope diaries.
Results: Fifty-two patients were included in the trial and had adenosine testing. There were 35 positive adenosine tests (67%) and 17 negative adenosine tests (33%). There was a mean of 0.4 fewer syncopal episodes per patient during the pacemaker-on period compared to the pacemaker-off period (1.2 vs. 1.6 episodes), with a higher relative risk of syncope in the pacemaker-off period compared with the pacemaker-on period (RR 2.1, 95% CI 1.0 to 4.4, p=0.048). In the adenosine-negative arm, one patient developed bradycardia requiring permanent pacing, giving a negative predictive value of the adenosine test for identifying a bradycardia pacing indication of 0.94 (95% CI 0.69 to 1.0).
Conclusion: Permanent pacing reduces the syncope burden in patients with unexplained syncope and a positive adenosine test, whilst a high negative predictive value demonstrates the low likelihood of a missed opportunity for pacemaker implantation. Our study suggests that a positive adenosine test unmasks bradycardia pacing indications without the need for prolonged and invasive investigations, providing opportunity for early and effective intervention.
Environment, alcohol intoxication and overconfidence: evidence from a lab-in-the-field experiment
Alcohol has long been known as the demon drink, an epithet owed to the numerous social ills it is associated with. Our lab-in-the-field experiment assesses the extent to which changes in intoxication and an individual's environment lead to changes in overconfidence or cognitive ability that are, in turn, often linked to problematic behaviours. Results indicate that it is the joint effect of being intoxicated in a bar, rather than simply being intoxicated, that matters. Subjects systematically underestimated the magnitude of their behavioural changes, suggesting that they cannot be held fully accountable for their actions.